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Abstract. Intensity mapping of neutral hydrogen (HI) is a promising observational probe 
of cosmology and large-scale structure. We present wide field simulations of HI intensity 
maps based on N-body simulations of a 2.6Gpc/h box with 2048^ particles (particle mass 
1.6 X 10^^ MQ/h). Using a conditional mass function to populate the simulated dark matter 
density field with halos below the mass resolution of the simulation (10® MQ/h < Mhaio < 
10^® MQ/h), we assign HI to those halos according to a phenomenological halo to HI mass 
relation. The simulations span a redshift range of 0.35 ^ z < 0.9 in redshift bins of width 
Az ~ 0.05 and cover a quarter of the sky at an angular resolution of about 7b We use 
the simulated intensity maps to study the impact of non-linear effects and redshift space 
distortions on the angular clustering of HI. Focusing on the autocorrelations of the maps, 
we apply and compare several estimators for the angular power spectrum and its covariance. 
We verify that these estimators agree with analytic predictions on large scales and study the 
validity of approximations based on Gaussian random fields, particularly in the context of the 
covariance. We discuss how our results and the simulated maps can be useful for planning 
and interpreting future HI intensity mapping surveys. 
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1 Introduction 


Intensity mapping of atomic and molecnlar line transitions is emerging as a promising probe of 
cosmological evolntion as well as of the connection between galaxy evolntion and the growth 
of large-scale strnctnre in the Universe [1-10]. The ability to detect a line over wide areas of 
the sky implies that one has accnrate redshift information over large cosmological volnmes and 
forecasts indicate great potential for the recovery of cosmological information from intensity 
mapping snrveys [11-15]. The 21cm line of nentral hydrogen (HI) is a particnlarly promising 
fntnre probe, being the target of several dedicated efforts nsing radio telescopes snch as 
BAOBAB [16], BAORadio [17], BINGO [18], and CHIME [19]. A snccessfnl HI intensity 
mapping experiment will map the integrated emission of HI within the beam of the instrnment 
over large parts of the sky. Dne to the high freqnency resolntion of radio telescopes, snch 
experiments wonld prodnce maps of the large-scale strnctnre of HI in the Universe within 
thin redshift slices. The angnlar and redshift resolntion in these snrveys will correspond to 
comparable physical scales. 

Predicting the constraining power of 21 cm experiments reqnires accnrate knowledge of 
the expected signal. Several approaches have been developed for this pnrpose. Analytic 
methods based on pertnrbation theory and the halo model can be nsed to predict some of 
the statistical properties of the signal, snch as the power spectrnm [11, 14, 18, 20, 21]. For 
a fast prodnction of wide held intensity maps, simnlations have been developed based on 
log-normal random helds [22]. A nnmber of different nnmerical approaches have also been 
nsed to nnderstand the effect of non-linear and baryonic processes on the signal. They range 
from hydrodynamical simnlations [23, 24] that model both collisionless dark matter and the 
hydrodynamics of baryons to approaches where the HI is simply assigned to resolved halos 
from dark matter only simnlations [25, 26]. 

In this paper, we follow an alternative and complementary approach by simnlating low- 
redshift HI intensity maps (z < 1) nsing a combination of dark matter helds from N-body 
simnlations and a halo model prescription for assigning HI to snb-resolntion dark matter 
halos. The advantage of this approach is that it prodnces maps that have both realistic 
clnstering from N-body simnlations and large volnmes, as necessary for fntnre HI intensity 
mapping experiments. For this pnrpose, we nse N-body simnlations of a 2.6 h~^ Gpc box [27] 
with a mass resolntion of 1.6 x 10^^ h“^ Below this mass resolntion, we model the halo 
distribntion nsing a conditional halo mass fnnction [28]. We then generate HI intensity maps 
by assigning HI mass to the halos following a phenomenological prescription [21]. 

Using the maps, we stndy the wide held clnstering properties of the HI intensity hnc- 
tnations, focnsing in particnlar on the impact of non-linearities. HI intensity maps are char¬ 
acterized by having both the wide held and the relatively low angnlar resolntion of Gosmic 
Microwave Backgronnd (GMB) maps and the three dimensional, non-Ganssian large-scale 
strnctnre of galaxy snrveys. We therefore measnre the angnlar power spectrnm of the HI 
intensity maps nsing the psendo estimator [29] and the pnblicly available PoISpice estima¬ 
tor [30, 31]. These estimators have been developed for the GMB and have also been applied 
to galaxy snrveys (see e.g. [32, 33]). We assess their relative performance in this new regime 
and compare the resnlts to expectations from analytic models. 

We also stndy the covariance of the angnlar power spectrnm estimators which is of par¬ 
ticnlar importance for interpreting npcoming HI snrveys and for assessing their constraining 
power. As for the power spectrnm, we apply covariance estimators nsed for the GMB and 
galaxy snrveys. We estimate the covariance of the aforementioned estimators nsing jack- 


- 2 - 


knife and analytical estimates based on Gaussian statistics and compare the results to the 
covariance estimated from a suite of simulated HI intensity maps. 

The paper is organized as follows. In section 2 we briefly review the principles of intensity 
mapping. The details of our simulations are discussed in section 3. In section 4, we use these 
simulations to study their angular power spectra. The covariance of the estimated angular 
power spectra is analyzed in section 5. We summarize and conclude in section 6. 


2 HI intensity mapping 


We hrst briefly review the principles of HI intensity mapping at low redshifts. We discuss the 
observational principles for the example of a single dish experiment before giving a summary 
of the theoretical expectations for the clustering of the large-scale HI distribution. For more 
details see e.g. [34] for a review. 


2.1 Cosmological 21 cm line emission 


Mapping the matter distribution in our Universe is a central part of observational cosmology. 
Traditionally, this is done by studying the distribution and the shapes of galaxies in wide 
held surveys. In HI intensity mapping, the idea is to study the distribution of HI by mapping 
the distribution of redshifted hux from its 21 cm line emission. After reionization, the bulk 
of HI is expected to be clumpy and associated with galaxies. E.g., at redshifts z < 1.5, 
damped Lyman-a (DLA) systems have been found associated with Mgll and Fell absorption 
systems [35, 36]. HI emission line stacking techniques applied to star forming galaxies at 
low redshifts z ~ 0.2-0.3 have yielded measurements of Hhi r-u 10 ^ [37, 38]. And hnally, 
HI emission studies in the local Universe {z < 0.03) have yielded individual detections of 
a large number of Hl-rich galaxies, with Hhi few X 10 ^ [39, 40]. See also [41] for a 
compilation of HI observations. In this work, we will distribute HI in dark matter halos using 
a phenomenological prescription proposed in [21] that is consistent with these observations of 
Hhi and the hypothesis that HI is typically associated with star forming galaxies^. 

When mapping the flux from individual clouds of neutral hydrogen for example with the 
large beam of a single dish radio telescope, the flux of discrete but non-resolved sources on 
the sky is integrated within a physical volume dehned by the size of the beam and a range in 
frequency. Very large single dish instruments like the Green Bank Telescope with a diameter 
of 100 m have a resolution of order 10^ at z ~ 1. In frequency, however, radio telescopes can 
have channels with a bandwidth of ~ 1 MHz and smaller, corresponding to redshift bins of 
width Az ~ 0.003 at redshift z ~ 1. In co-moving distances, the intensity maps from such an 
instrument would hence have comparable angular and radial resolution of order rsj 10 h ^ Mpc 
at this redshift. In the more distant future, however, intensity mapping surveys with long 
baseline interferometers like the SKA will be able to improve the angular resolution by many 
orders of magnitude. 

Given an average density of neutral hydrogen pm{z) at redshift z or equivalently its 
ratio to the critical density today Hhi( 2;), the average brightness temperature of 21cm flux is 
given by [18] 


r(z) ~ 44 pK 


/ Q.]ii{z)h \ 
V2.45 X 10-4 ) 


(1 + ^)^ 
E{z) 


( 2 . 1 ) 


^We do note that recent measurements of the clustering of DLA systems at redshifts 2 < z < 3 [42] are 
discrepant with direct HI observations on which our model is built [21]. The resolution of this discrepancy is 
unclear at present, and we will return to this issue in future work. 
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with E{z) = H(z)/Hq being the normalized Hubble parameter. At redshift z ~ 1, this 
corresponds to brightness temperatures of approximately 100 pK and is therefore roughly four 
orders of magnitude sub-dominant to the brightness of our own galaxy at these frequencies 
which is at the ~ 1 K level even at high galactic latitudes. One of the key challenges of this 
technique is hence the ability to separate the signal from the Galaxy and other extra-Galactic 
radio sources [43-46]. 


2.2 Large-scale structure of HI 

The fluctuations of 21cm brightness temperatures are expected to be a biased version of 
the fluctuations in the matter density held. Hence, the Fourier transformed temperature 
huctuations in the map at large scales can be written as 

5T{k, z) = T{z)b-m{k, z)6{k, z), (2-2) 


where 5 is the dark matter overdensity and b}ii{k, z) is the bias of HI relative to 5. 

On even larger scales, the power spectrum of the matter density huctuations is well 
described by linear theory predictions and the halo bias is scale independent. Assuming a 
simple relation between HI mass mni and halo mass m, the large scale bias of HI 

bm{z) can be predicted from the halo mass function using: 

z)= ( 2 . 

J Pm[z) 



where pm{z) = f dmN{m, z) mHi(m) is the mean density of HI, N(m, z) is the unconditional 
differential number density of halos and b{m, z) is the corresponding halo bias (we use the 
Sheth-Tormen forms [28]; see also Section 3.2). The large scale power spectrum of HI is hence 
expected to be well described by 

Pm{k,z)c:^ [fiz)bm{z)D{z)]^ P{k), (2.4) 


where P{k) is the linear theory power spectrum of the matter overdensities at z = 0 and D{z) 
is the linear growth factor. 

As wide-held surveys probe a spherical region rather than a cartesian box, different 
methods for measuring the two-point correlations have been proposed and compared (see 
e.g. [47-49]). One approach is to perform a tomographic analysis based on the angular corre¬ 
lations (or equivalently the power spectrum) of the held within bins of redshift. This approach 
is convenient for intensity mapping as techniques for its estimation from maps are well estab¬ 
lished from the GMB [30, 31, 50-54], there is no need for assuming a cosmology during the 
analysis [55, 56], and cross-correlating different surveys is straight forward [57]. Furthermore, 
intensity mapping surveys will produce maps of brightness temperature huctuations within 
bins in frequency and thus a tomographic analysis within these bins does not erase any in¬ 
formation. The angular power spectrum Ci^{z,z') of the temperature huctuations is related 
to P{k) via 


Ce{z,z') = - / dzW{z)f{z)D{~z)bm{z) / d~z'W'{z')f{z')D{z')bm{z') 


TT 


X / k‘^dkP{k)jeikR{z))jeikR{z')), 


(2.5) 
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where W, W are the redshift window functions for the two tomographic bins around 2 ;, z' and 

is the co-moving distance to redshift z. 

An additional contribution to the angular power spectrum arises from the shot noise 
of the discrete sources. Given a population of unresolved, Poisson distributed point sources 
with neutral hydrogen mass ttihi distributed over the full sky, the power spectrum of the 

resulting intensity map is given by f drriHi nHl(?7iHl) nHi(m,Hi) being 

the differential source count per steradian and ?fiHi(^) being the mean hydrogen mass of 
the population [58]. For our maps, this contribution will however turn out to be negligible 
(< 5 X 10“^ pK^ at the lowest redshift) due to the large number of low mass halos. 

3 Simulations of HI intensity maps 

In the following, we describe how we construct simulated HI intensity maps using the dark 
matter held of N-body simulations, the halo model, and a phenomenological prescription for 
assigning HI mass to halos. 

3.1 Matter density fields 

We use 10 N-body simulations run with the L-Gadget2 code (based on Gadget2 [59, 60]). 
These simulations are also being used to simulate galaxy catalogs for the Dark Energy Sur¬ 
vey [27, 61, 62], and a subset of these catalogs have been used previously in several Dark 
Energy Survey studies [33, 63-66]. In this work, we use only the dark matter distribu¬ 
tion from these simulations. Each of the 10 simulations contains 2048^ particles of mass 
1.6 X 10^^ h“^ Mq in a 2.6 h“^ Gpc box and produces lightcones on the hy by writing all par¬ 
ticles that cross the lightcone surface at each time step of the simulation. The simulations use 
a hat AGDM cosmology with matter density relative to critical Qm = 0.286, baryon density 
relative to critical Hb = 0.047, Hubble parameter h = 0.72, root mean squared density huctu- 
ations at 8/i“^Mpc given by erg = 0.82 and spectral index of the primordial power spectrum 
Us = 0.96. We use the lightcone output to create HEALPix^ [67] maps of the projected mat¬ 
ter distribution with an angular resolution of about 7' (ngide = 512) by counting the number 
of particles per pixel, where a pixel is the cosmological volume dehned by a HEALPix cell 
and a redshift bin. We choose the redshift bins such that they correspond to the redshift of 
the 21cm line in equally spaced bins in frequency between 750 MHz and 1050 MHz with a 
bandwidth of 25 MHz per channel. This results in 12 maps of the radially averaged density 
held between redshifts 2 ; ~ 0.35 and 2 ; ~ 0.89 with negligible shot noise and redshift bin width 
from A 2 : ~ 0.033 at the lowest to Az ~ 0.061 at the highest redshift. In co-moving scales, 
our pixels have an angular and radial size of 2h“^Mpc and 82h~^Mpc at 2 ; ~ 0.369 and 
4h“^ Mpe and 110 h“^ Mpe at 2 : ~ 0.863, respectively. As the co-moving distance to 2 ; = 0.89 
(2.1 h“^ Gpc) is bigger than half the size of the box (1.3h“^ Gpc), a full-sky lightcone would 
overlap in the simulated box. To balance overlap and sky coverage, we analyze four quadrants 
of the lightcone separately. This procedure hence yields 40 realizations of the matter density 
held, each covering a quarter sky. The volume of one quadrant corresponds to roughly 53% 
of the total box volume. 11% of the quadrant volume is overlapping in the box, but most 
of this overlapping volume appears at different redshifts. The largest overlap within a single 
redshift bin is 3% in the highest redshift bin around 2 ; ~ 0.86. 

^http://healpix.sourceforge.net 
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3.2 Halo assignment 

As mentioned in section 2.1, the dominant part of HI is expected to be fonnd within galaxies. 
The challenge of simnlating a wide held HI intensity mapping snrvey is hence that the signal 
is expected to come from relatively low mass halos that cannot be easily resolved by N-body 
simnlations with the box sizes needed for the sky coverage corresponding to wide held snrveys. 

Taking a cne from the coarseness of the instrnment resolntion for HI intensity mapping 
experiments which erases information abont the angnlar positions of individual sonrces, one 
possible prescription for tnrning density helds into intensity maps is to neglect the discrete 
origin of the 21cm intensity and to assnme that overdensity 5 and brightness temperatnre 
hnctnations 5T are linearly related: 

6T{z,&)^bHi{z)f{z)6{z,e). (3.1) 


In this case, the intensity map wonld be a rescaled version of the density held. In [22], for 
example, the anthors followed this approach in their work on creating fast intensity mapping 
simnlations nsing random log-normal helds. As we expect the HI bias to be closely related 
to the halo bias, we know that a scale independent, linear bias prescription is only valid at 
large, linear scales. A possible improvement over eqnation (3.1) wonld be a non-linear bias 
as in [68] which however still ignores possible scale dependencies and the stochastic relation 
between halo and density held. 

We therefore choose to go one step fnrther in exploiting the low resolntion of the intensity 
maps by combining a simnlated large scale density held with the conditional mass fnnction of 
dark matter halos as derived from the halo model. The coarseness of the angnlar resolntion 
of intensity mapping experiments indeed means that we are not interested in simnlating the 
exact locations of the relevant dark matter halos, bnt only their total nnmber in each pixel. 
The conditional mass fnnction A/'(m|M) calibrated by [28], gives the differential nnmber of 
halos in the mass range (m, m + dm) within a Lagrangian volnme containing mass M, 

M|r(s|5)| / (H(s)-hlin(M))2' 


M {m\M)dm = 


m - S 


exp 


2(s - S) 


ds 

s-S 


(3.2) 


where s is the variance of the linear held when smoothed on the Lagrangian scale of the 
halo mass m, S is the corresponding variance at scale M, is the linearized overdensity 

corresponding to the mass M, and B{s) is the moving barrier shape associated with ellipsoidal 
collapse: 

H(s,z) = Va<5sc(2:) + /3 (3.3) 

where 5sc{z) is the critical overdensity for collapse at redshift z and the htting parameters 
are a = 0.707, a = 0.615, and (3 = 0.485. Finally, r(s|S) is the 5th order Taylor series of 
B{s) — (5iin(M) aronnd s = S. 

What we have available from the pixelated density helds is the mass M or eqnivalently 
the (non-linear) overdensity 5 contained in each pixel. Ideally we wonld like to nse a mass 
fnnction conditioned on this 6. In order to apply eqnation (3.2) to the density held, however, 
we need to relate 5 to its linearized overdensity In order to maintain a physical 

link between 6 and hiin, we do this by assnming that the density in each pixel approximately 
follows spherical evolntion, in which case we can write [69]: 


l + <5 = 



(3.4) 


6 








Halo mass 


Figure 1. Number of sampled halos within a given mass bin and averaged over all pixels in the lowest 
(blue) and highest (green) redshift bin. The dashed black lines show the predictions of the uncondi¬ 
tional mass function from [28]. The distribution of the sampled halos agrees with the prediction to 
better than 10%. 


where we have suppressed the mass and redshift dependence. We have tested the approxi¬ 
mation by checking that the distribution of (5iin returned by this procedure is approximately 
Gaussian for our density fields (skewness ~ —0.14 and excess kurtosis ~ —0.10 at the lowest 
redshift). Going to maps with even smaller redshift bins might be desirable in the future, as 
radio telescopes can have frequency resolutions that are better than the 25 MHz chosen for 
our simulations. This would however either require better modeling of (5iin, for example by 
taking the initial conditions of the simulation into account, or fitting the conditional mass 
function directly to the non-linear density field. 

Furthermore, the relation between mass and variance requires us to choose a smoothing 
filter. The derivation of the mass function from excursion set theory assumes a top-hat filter 
in Fourier space [70], while halos are typically defined as localized objects in real space. As 
the results only depend on the variance of the filtered field, standard practice is to use a 
spherical top-hat filter in real space to relate the halo mass m to the variance s. In [28], also 
the mass M of the local overdensity is related to a variance S with a spherical top-hat filter. 
For the pixelated density field, however, a spherical filter is a poor approximation for relating 
M and S', since our pixels have complicated elongated shapes. In order to be consistent, we 
therefore set S to be the actual variance of the linearized pixelated density field across each 
map. We have checked that the procedure described above leads to an average mass function 
that is within 10% of the unconditional mass function obtained by setting S' —)• 0 and (5iin —)• 0 
in equation (3.2) (see Figure 1). 

When populating the density field with halos, we first define a minimum and maximum 
halo mass. We then Poisson sample the number of halos in each pixel around a mean of 

^TTlmax 

N{M) = / dmM{m\M), (3.5) 

</TTTmin 
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Algorithm 1 Algorithm for simulating HI intensity maps. 

1: create mass map of matter distribution from N-body lightcone output 
2: for mass M in pixel i of the mass map do 
3: turn M into an overdensity 6 

4: turn 6 into a linearized overdensity Sun using eq. (3.4) 

5: calculate J\f {m\M)dm using eq. (3.2) 

6: sample the number of halos n from a Poisson distribution centered on N{M), eq. (3.5) 

7: sample n halo masses from J\f{m\M)/N(M) using inversion sampling 

8: turn {nij} into HI masses {mnij} using eq. (3.6) 

9: set pixel i of HI mass map to Ylj ^-Hlj 

10: end for 

11: turn HI mass map into 21cm intensity map using eq. (2.1) 


where M is the mass in the pixel. Finally, we sample a mass m for each of the halos from the 
distribution^ J\f{m\M)/N{M). We end up with a sample of halo masses that is drawn from 
the conditional mass function for each individual pixel on the map. 

When following the described procedure, we however ignore the effect of peculiar veloc¬ 
ities on the simulated maps. Peculiar velocities of the halos can shift the 21 cm line of the HI 
within some of the halos into neighboring frequency bins, thereby distorting the line-of-sight 
boundaries between the pixels. As the prospects of redshift space distortion (RSD) measure¬ 
ments with the 21cm line are promising [14, 72], we propose a way for including them in the 
simulations in Appendix A.3. 


3.3 HI intensity maps 

To simulate the sky as seen by a HI intensity mapping survey, we need to turn the distribution 
of halos and their masses into a distribution of HI mass. For a summary of different approaches 
to this problem see [41] and [21]. Our approach closely follows work in [25]: For halos that 
satisfy a constraint on the circular velocity of the halo, we assign a fraction of the halo mass 
as HI mass. As proposed in [21], we use a halo mass to HI mass ratio a = 0.15 together with 
an exponential cut-off at circular velocities Vcfl = 30 km / s and Uc,i = 200 km / s: 


mHi(m) 


a(l — Yp)-^mexp 

^ 'm. 




(3.6) 


where m is again the mass of the halo. Hb and are set by the simulations and we use 
a Helium fraction of = 0.24. Using equation (3.6) to assign a HI mass to all the halos 
in a given pixel, the total HI mass in the pixel is simply given by the sum over all halos. 
We can finally rescale the resulting HI mass map into an intensity map of 21cm brightness 
temperature by using equation (2.1). 

Algorithm 1 gives a summary of the proposed procedure for generating HI intensity 
maps. For the dark matter halos, the main assumptions of our procedure are that the lin¬ 
earized field is well approximated by equation (3.4) (line 4) and that the conditional mass 
function sampling within our pixels (line 5 and 6) is a good description of the halo distri¬ 
bution. In Appendix A we study those steps in more detail. Our main arguments for the 

^In principle, one might worry that this procedure does not conserve mass [71]; in our case, however, we 
have rrimax ^ M and we have checked that mass conservation is not a problem in practice. 
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Figure 2. Simulated HI intensity maps in units of pK brightness temperature at z ~ 0.863 (top) and 
2 ~ 0.369 (bottom). The zoom regions have side lengths of 15°. 


validity of our approach are the good agreement of the resulting unconditional halo mass 
function (see Figure 1) and large scale bias (see Figures 4 and 5) with the analytical results 
from [28]. The analytic expressions have been shown to be in reasonable agreement with 
N-body simulations [73-77], and the fact that we are reproducing the analytical results at 
the mass function and bias level is encouraging. We leave more detailed investigations of 
the accuracy of our procedure with high-resolution simulations of smaller volumes for future 
work. 

Figure 2 shows the simulated HI intensity maps at the lowest and highest redshift (z ~ 
0.369,0.863, respectively). In Figure 3, we show the relation between dark matter overdensity 
6 and brightness temperature fluctuations 6T induced by the conditional halo mass function 
sampling. Similar to findings in studies with high resolution N-body simulations (see e.g. [78]), 
we find that the relation between halo overdensity and matter overdensity deviates from a 
linear behavior for large overdensities. Figure 3 also shows the individual distributions of 5 
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z = 0.369 z = 0.863 



Figure 3. Relation between matter overdensity S and temperature fluctuations ST /T in the simulated 
maps at redshift z ~ 0.863 (right) and z ~ 0.369 (left). The one-dimensional histograms show the 
individual distributions of S and ST/T along with a log-normal distribution with the same mean 
and variance (black line). The two-dimensional histograms show the non-linear, stochastic relation 
between S and ST/T as compared to a simple linear bias model (black line). Both the color-scaling 
of the two-dimensional histograms and the y-axis of the one-dimensional histograms are log-scaled. 




Figure 4. Comparison of density (left panel) and large scale biasing of HI (right panel) as a function 
of redshift as predicted by the unconditional mass function from [28] (green) for the prescription of 
eq. (3.6), as measured on our simulated intensity maps (red), and from the compilation in [41] (blue). 
The values from the unconditional halo model agree well with the simulations by construction, with 
small deviations coming from the imperfect linearization (equation (3.4)). The rini values from our 
prescription are slightly high with respect to the compilation in [41] . Once better data is available, a 
more significant mismatch in amplitude could be easily fixed by adjusting a in equation (3.6). 


and 5T together with a log-normal distribntion of the same mean and variance. 

The left panel of Fignre 4 shows a comparison of Ohi(.z) resnlting from onr prescriptions 
along with the data points of the compilation in [41] consisting of damped Lyman-a and 
intensity mapping observations. We can see that eqnation (3.6) yields a somewhat high Ohi 
compared to the data compilation, as the prescription was htted to another data compilation 
and for a cosmology with a lower cjg. Qh! conld be matched more precisely by adjnsting the 
parameter a in eqnation (3.6) which wonld simply resnlt in an overall rescaling of all maps. 
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As all our results are independent of the overall amplitude, we decided to simply adopt the 
value of a proposed in [21], The mapping between halo mass and HI mass will not be one- 
to-one in reality, and more work has to go into the question on how much coarse graining is 
allowed when relating halo and HI mass as more data becomes available. A more detailed 
modeling of e.g. the scatter around this relation is however beyond the scope of this work. 


4 Angular power spectrum 

In this section, we present our results on the angular clustering of the simulated HI intensity 
maps as described by their angular power spectrum. We first describe the estimators that we 
apply to our intensity maps and then discuss how the non-linearities affect the angular power 
spectrum. We want to study how the estimators are affected by the non-linear, non-Gaussian 
field. The non-linearities are strongest at low redshifts, and we will in the following focus 
on the auto-correlation spectrum Ci{^z) in the lowest redshift bin. We contrast the results 
with the auto-correlation spectrum in the highest redshift bin to compare the performance 
for different levels of non-Gaussianity. 


4.1 Estimation 

Given a map with brightness temperatures Tj drawn from a field with underlying angular 
power spectrum Ci and its coefficients aim of the map’s expansion in spherical harmonics, the 
angular power spectrum can be estimated as 




1 

WTi 


e 

^ ^ \0‘lm 
m=—£ 


2 


(4.1) 


When only parts of the sky are available, recovering the spherical harmonics from the masked 
field is more complicated. The pseudo Ci estimator [29] is a standard approach to this 
problem: In this approximation, the masked pixels are set to the mean value of the unmasked 
part before the full sphere is decomposed in spherical harmonics. The power spectrum is then 
estimated via 


Apseudo 


1 

(2^+l)/sky 


m=—i 


(4.2) 


with /sky being the fraction of unmasked sky [see also 50-54, for more sophisticated ap¬ 
proaches] . 

In addition to the pseudo Ci estimator, we also consider the publicly available package 
PolSpice [30, 31] in our analyses. PolSpice first estimates the correlation function of the 
masked fluctuations, corrects for the mask, and then calculates the demasked angular power 
spectrum from the corrected correlation function. To avoid artifacts from the decomposition 
of an incomplete correlation function of the masked field, the correlation function has to 
be smoothed with an apodization function. We used Gaussian random fields with a mask 
corresponding to those applied to the simulated intensity maps to fix the parameters of a 
Gaussian apodization (width of cr = 90° and a maximum angle of the correlation function 
Pmax = 120 °). 
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Figure 5. Angular power spectrum as estimated from the matter density (green) and HI intensity 
maps (blue) using PolSpice, averaged over all realizations of the simulations. We show the dimension¬ 
less spectrum for the intensity maps, scaled by the mean brightness temperature. The theory predic¬ 
tion (black) for the matter power spectrum are calculated with CLASS [79, 80] and halofit [81, 82[. 
The thick lines show the spectra when ignoring redshift space distortions, while the thin lines are 
showing the results for the distorted maps (see Appendix A.3 for more details on our modeling of 
redshift space distortions). The bottom panel shows the ratio between the predicted matter spectrum 
and the estimated spectra for the matter and intensity maps without redshift space distortions. As 
expected, the spectrum of the matter field is consistent with the prediction and their ratio is close to 
one. Deviations after i ~ 700 are due to pixel effects. The ratio of intensity mapping spectra and the¬ 
ory prediction for the matter field should be compared to the large scale HI bias from equation (2.3) 
(red). We see that the HI bias matches the ratio at large scales (i < 50) and that there is a scale 
dependence at smaller scales. 


4.2 Results 

In Figure 5, we show the angular power spectrum as estimated from the matter density and 
HI intensity maps using PolSpice, averaged over all 40 realizations. We also show the theory 
predictions for the angular power spectrum of the matter density field from equation (2.5) 
as calculated with CLASS [79, 80], using halofit [81, 82] to model the non-linearities. The 
bottom panel shows the ratio of the estimated to the predicted power spectra. Figure 5 also 
shows the angular power spectra of the maps when including redshift space distortions (thin 
lines, see Appendix A.3 for details on our modeling of peculiar velocities). In linear theory, 
the redshift space distortions lead to a modification of the power spectrum [83] that results in 
an enhancement of the angular power spectrum at large scales but does not affect scales much 
smaller than the size of the redshift bin (see e.g. [55, 84] for more details). Figure 5 shows 
that the linear theory behavior of the redshift space distortions as calculated with CLASS is 
correctly reproduced by our distorted maps. 

As expected, the matter power spectrum is consistent with the theory prediction over 
most scales with a deviation at small scales {i > 700) due to pixel effects. The effect of 
non-linearities in the density field at the power spectrum level is hence well modeled by the 
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halofit corrections to the linear power spectrum. The agreement also confirms that, to the 
precision we are interested in, the overlap of a few percent per mass map within the simulated 
box (see also section 3.1) is not affecting the angular power spectra. 

The ratio between the intensity mapping spectra and the predicted matter density is 
seen to be close to on large scales (£ < 50). Even though the agreement with the 
analytic prediction for the large scale bias is expected, it is a non-trivial consequence of the 
conditioning of the mass function on the overdensities within each pixel. Figure 5 hence shows 
that our prescription for assigning halos to the density field reproduces the halo statistics at 
the two-point level on large scales remarkably well. On smaller scales, our modeling of the 
halo distribution with the conditional mass function yields a mildly scale dependent bias in 
particular at the lowest redshift, where the ratio falls by ~ 10% of its large scale value at 
i ~ 200. The right panel of Figure 4 shows a comparison of 6 hi from equation (2.3) with the 
large-scale bias in the simulated intensity maps, estimated as the square root of the average 
ratio of the intensity mapping angular power spectra and the theory predictions on scales 
10 < £ < 50. Figure 4 also shows good agreement with a compilation of bias values in [41], 
derived from a selection of theoretical prescriptions from the literature. 

5 Covariance of the angular power spectrum 

For precise cosmological inference from the angular power spectrum, it is important to know 
the distribution of its estimator. To first order, this means that we need to estimate the 
covariance of the angular power spectrum as measured from intensity maps. As in section 4, 
we first describe our estimation procedure before discussing the results, focusing in particular 
on the effect of non-linearities on the covariance of the angular power spectrum. 

5.1 Estimation 

We describe and compare three standard approaches to the problem of covariance estimation: 
estimation from multiple simulations, estimators based on Gaussian random fields and a 
jackknife approach. 

5.1.1 Multiple simulations 

As already mentioned in section 3.1, we use the density field from 10 independent N-body 
simulations as the basis for our intensity maps. We use each of the boxes to create 4 simula¬ 
tions which cover a quarter of the sky. If we focus on the first quadrant of each simulation, 
then we estimate the covariance in this case as: 

1 ” 

Cov(Q,Q0 = — ^(q - C,){C\, - Ce), (5.1) 

i=0 

where n = 10, is the power spectrum estimated from the realization (this could be 
either the pseudo Ci or PolSpice estimator), Ci = (1/n) Y17=o mean over all realiza¬ 

tions and the n — 1 denominator is chosen to yield an unbiased estimator of the underlying 
covariance. We do this for each quadrant and take the final covariance matrix estimator to 
be the arithmetic mean of Cov(C'£, G^/) over all four quadrants. This way we avoid that cor¬ 
relations between the four quadrants introduced through the finite size of the simulated box 
affect the estimated covariance. In Appendix A. 3, we additionally study the effect of redshift 
space distortions on the estimated covariances. 
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5.1.2 Gaussian approximation 

In the case of an isotropic Gaussian random field on a full sphere, the coefficients aim of the 
spherical harmonic decomposition are independent Gaussian random variables and identically 
distributed for each t. Gonsequently, the estimator Ci (4.1) is chi-squared distributed with 
mean Ci and variance [85] 

Var(Q) = (5.2) 

The covariance Cov(C'£, Ci') is zero for in this case. 

As for the estimation of the Cgs, these considerations get more involved in the case of 
partial sky coverage (see e.g. [50-54, 86 ]). A standard approximation for the variance of the 
pseudo Ci estimator is given by [53] 

where as before /sky is the fraction of unmasked sky and Cov(Cj'^®“'^°, is approximated 

to be zero for £ 7 ^ t". Since this approximation neglects correlations between individual i scales 
as introduced by the mask, it is a good approximation only for bandpowers 


Bi 


1 

2A£+1 


i+Ae 

E 


i'=i-Ae 


(5.4) 


where Ai is chosen such that the correlations between the bandpowers is small. Following 
the definition of Bi, the variance of the bandpowers relates to the Ci variance as follows: 

e+Ae e+Ai 

= (WTW ^ ^ Cov(C,,,C,,,). (5.5) 

' ' e=i-Aii"=i-Ai 


Besides the pseudo Ci approach, we will also use the PolSpice estimator for the covari¬ 
ance [54] which is based on Gaussian field statistics and takes the mask into account. We 
calculate the Gaussian field estimators for each simulation individually and report the results 
in section 5.2. 


5.1.3 Jackknife estimation 

Resampling techniques aim at estimating the covariance of an estimator by manipulating the 
sample on which it is based. The advantage of resampling techniques is that they do not 
assume a field with specific statistical properties. The disadvantage is that they typically 
assume independence of different parts of the sample which is not satisfied by correlated 
fields. In a jackknife approach, one studies the behavior of an estimator when parts of the 
data are ignored in the estimation process. In the context of our maps this implies that we 
study the distribution of angular power spectra as derived from maps where a subset of the 
simulated brightness temperature pixels is masked. We split the simulated part of the sky 
into Tijack parts and estimate all power spectra (using either the pseudo Ci or PolSpice 

estimators) when the z**® of the njack parts is masked out. The covariance of the angular power 
spectrum is then estimated as 

T7.J 3j(2k 

Covjack(Q, QO = £ (Cf- Gf'^)(Cf - Gf") (5.6) 

^jack 
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with = (l/rajack) analysis, we define the masked-out patches as 

pixels in coarser HEALPix maps. The size of the patches was determined after an analy¬ 
sis of stochastic helds and is discussed in Appendix B. As for the Gaussian held estimate, 
we calculate a jackknife estimate individually for each simulation and report the results in 
section 5.2. 

5.2 Results 

To present our results on the covariance of angular power spectra of HI intensity maps, we 
decompose the covariance into its diagonal part (the variance) and its off-diagonal part (the 
correlation matrix) as given by: 


ct 2(Q) = Cov(Q,Q), 


Corr(Q,C,0 


Cov(Q,C'fH 

a{Ct)a{Ce) ■ 


(5.7) 

(5.8) 


For simplicity, we will call the results regarding the covariance estimated from the 10 indepen¬ 
dent realizations of the intensity maps the simulation estimator in the following. We present 
a covariance analysis for Gaussian and log-normal random helds with similar statistical prop¬ 
erties as the simulated intensity maps and for our survey geometry of a contiguous quarter 
of the sky in Appendix B. 


5.2.1 Variance 

Figure 6 shows the different estimates of cr{C() for the lowest {z ~ 0.369) and highest 
{z ~ 0.863) redshift bin of the intensity map simulations presented in section 3. For all 
redshifts and estimators, the jackknife estimator (5.6) is consistently biased^ to higher a{Ci) 
by approximately ~ 40%, an effect that is also seen for the stochastic held models in Ap¬ 
pendix B. The PolSpice estimator is consistent with the simulation estimator within the 
statistical noise. We hnd that the pseudo Ci estimate for cr^Ci) given in equation (5.3) pre¬ 
dicts a diagonal error on the angular power spectrum that is a factor of ~ 2 higher than the 
simulation estimator, compensating for the lack of correlations (see Figure 6). To account 
for this effect, we also show the results for the variance of pseudo Ci bandpowers for a 
bandwidth of Ai = ±5 (see equation 5.4). As shown in Figure 7, the bias of the pseudo Ci 
estimate for a{B^) drops below 10%. We have checked that this bias decreases further when 
increasing Ai at the cost of losing more and more information due to the averaging over i. 

Overall, the variance is hence in good agreement with predictions for Gaussian helds, 
even though the temperature huctuations follow the complex distribution imprinted by grav¬ 
itational clustering. Being more biased while having a similar variance, the results from the 
jackknife approach are clearly outperformed by the Gaussian estimate. The results from this 
section are consistent with our analysis of stochastic helds in Appendix B. In Appendix A.3, 
we furthermore investigate the effect of redshift space distortions on the estimator variance 
of the angular power spectra and do not hnd deviations between Gaussian prediction and 
estimated variance either. 

^Whenever we talk about bias in this section, we refer to the bias of an estimator in the statistical sense, 
i.e. as a mismatch between expected value of the estimator and the true, underlying quantity. This is not to 
be confused with the concept of halo and HI bias discussed in sections 2.2 and 4.2. 
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Figure 6. Standard deviation a{Ci) of pseudo Cg (top) and PolSpice (bottom) estimators as esti¬ 
mated from the 10 independent simulations (black), the jackknife (blue), and the Gaussian assumption 
(green) for the lowest {z ~ 0.369, left panel) and highest {z ~ 0.863, right panel) redshift bin of the 
simulations. The bottom panel of each plot shows the ratio of the jackknife and Gaussian estimator 
to the result from the 10 realizations. The shaded areas in the bottom panels show the standard 
deviation of the ratio, estimated from the distribution of the jackknife and Gaussian estimator over 
the 10 realizations. For the pseudo Cg estimator, a{Cg) is overpredicted by the Gaussian estimator 
by a factor of ^ 2. Within the noise, the PolSpice estimate is consistent with the result from the 
independent simulations. The jackknife estimate overestimates a{Ci) by about 40%, independently 
of scale, redshift, and estimator. 
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Figure 7. Standard deviation a{Bi) of pseudo Ci bandpowers defined in equation (5.4) as estimated 
from the 10 independent simulations, the jackknife, and the Gaussian assumption for the lowest 
(z ~ 0.369, left panel) and highest (z ~ 0.863, right panel) redshift bin of our simulations. The bottom 
panel shows the ratio of the jackknife and Gaussian estimator to the result from the 10 realizations. 
The shaded areas in the bottom panels show the standard deviation of the ratio, estimated from the 
distribution of the jackknife and Gaussian estimator over the 10 realizations. The bandpowers are 
averaged over a bandwidth of A£ = 5. The bias of the pseudo estimate for a{Bi) is dropping from 
90% in the case of the unbinned spectrum (Figure 6) to below 10% for the bandpowers. 


5.2.2 Correlations 

For a full-sky Gaussian field that is statistically homogeneous and isotropic, we do not expect 
any correlations between different £ scales. Non-zero off-diagonal elements in the correlation 
matrix can however be introduced by the finite mask of the map and through the non-Gaussian 
nature of the matter density field. Since the PolSpice estimate of the angular power spectrum 
and its covariance already accounts for the mask, it is particularly interesting to compare the 
latter with correlation estimates from the independent simulations and the jackknife, thereby 
enabling us to distinguish masking effects from non-Gaussianity. 

Figure 8 shows Gorr(C'£, C^/) as a function of £! for the lowest (top row) and the highest 
(bottom row) redshift bin and at small = 650, left column) and large scales {I = 25, right 
column). Overall, the PolSpice estimate appears to be in good agreement with the simulation 
estimator within the noise. The jackknife estimate is biased to negative correlations for scales 
separated by 5 < < 15 and is in good agreement with the simulation and PolSpice estimate 

otherwise. 

To reduce the statistical noise of the simulation estimator, we also show the correlation 
Gorr(C'£, C£_|_ 5 £) as a function of bt when averaged over neighboring i values (we decide to 
average over 100 fe). The result is shown in Figure 9. Again, we find good agreement between 
the PolSpice estimate and the simulation results. Yet, for small scales and low redshifts 
(top right of Figure 9), there are ~ 5% extra correlations for bl > 15 in both jackknife 
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Figure 8 . Correlations Corr(C'f, C^/) for the PolSpice angular power spectrum as estimated from the 
10 independent simulations, the jackknife, and the Gaussian assumption for the lowest (z ~ 0.369, 
top panels) and highest (z ~ 0.863, bottom panels) redshift bin of the simulations. Correlations 
between large scales (around i = 25) are shown on the left, small scales (around I = 650) are shown 
on the right. The shaded areas show the standard deviation of the jackknife estimator from the 10 
independent simulations. The estimate from the 10 realizations is consistent with the PolSpice result 
within the noise. 


and simulation estimators. The bias to negative correlations of the jackknife estimate for 
5 < (5^ < 15 can be seen in all four panels of Figure 9. 

The pseudo Ci estimator for the covariance neglects correlations, hence its correlation 
matrix is simply the identity matrix. As expected, both the simulation and the jackknife 
estimator show that the pseudo Ci covariance is not diagonal in practice. Its off-diagonal 
contributions from mask and non-Gaussianity are very similar to those reported for the Pol¬ 
Spice covariance. 

The overall agreement between the correlations predicted for a masked Gaussian field, 
the simulation estimator, and the jackknife is remarkably good. While the correlations be¬ 
tween individual I values estimated from the simulations are too noisy for a detailed compar¬ 
ison, the averaged results in Figure 9 indicate deviations of approximately 8 % for small scales 
at the lowest redshift z ~ 0.369 which are not detectable at the highest redshift z ~ 0.863. 
The analysis of Appendix A.3 shows that this behavior remains unchanged when taking red¬ 
shift space distortions into account. As for the variance, the results on the correlation matrix 
are in line with our findings from the analysis of stochastic fields in Appendix B. While 
the correlations introduced through the mask are present at all scales, the correlations from 
the non-Gaussianity induced by non-linear gravitational clustering only affect small scales. 
Instrumental noise will also mostly affect small scales, suppressing the contributions from 
non-Gaussianity for realistic surveys. 
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Figure 9. Mean correlation (Corr(C'f, Ct+si)) for the PolSpice angular power spectrum, averaged over 
100 £ values. We show estimates from the independent simulations, the jackknife, and the Gaussian 
assumption for the lowest {z ~ 0.369, top panels) and highest (z ~ 0.863, bottom panels) redshift 
bin of our simulations. Correlations between large scales (0 < £ < 100) are shown on the left, small 
scales (600 < £ < 700) on the right. The shaded areas show the standard deviation of the jackknife 
estimator from the 10 realizations. 

6 Conclusions 

Intensity mapping of the redshifted 21 cm emission line of neutral hydrogen (HI) is a promis¬ 
ing technique for studying the large-scale distribution of matter in the Universe and will po¬ 
tentially yield complementary information to more traditional galaxy redshift surveys. The 
majority of the HI content of the low redshift {z < 1) Universe is believed to reside in dense 
clumps inside galaxies. The low angular resolution of radio telescopes combined with high 
frequency resolution then means that 21 cm intensity mapping surveys will map an unresolved 
(in angular position) population of galaxies containing HI in fine redshift bins. In this paper 
we have simulated wide field HI intensity maps and used them to study the statistical proper¬ 
ties of the expected brightness temperature signal, in particular the angular power spectrum 
Ci and its covariance. 

We have simulated HI intensity maps at redshifts 0.35 ^ z < 0.9, based on a suite of 
10 N-body simulations of a 2.6 h“^ Gpc box [27] and a mass resolution of 1.6 x 10^^ h“^ Mq. 
We created HEALPix [67] maps of the dark matter density field from the N-body simulations 
in equally spaced bins in frequency. Since our simulations do not resolve all the low mass 
halos that are expected to host HI, we have used the coarse density field in each pixel to 
statistically sample these halos using a halo mass function conditioned on the overdensity of 
the pixel following [28]. The low angular resolution of the intensity map means that we do not 
need to assign locations to individual halos within each pixel. Instead, each of these halos is 
assigned an amount of HI determined by its dark matter mass following a phenomenological 
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prescription based on reference [21], The resnlting distribntion of HI has then been nsed to 
generate 12 maps of redshifted 21 cm brightness temperatnre with an angnlar resolntion of 
7' between z ~ 0.35 and z ~ 0.89 and redshift bin widths from IS.z ~ 0.033 at the lowest to 
Az ~ 0.061 at the highest redshift. 

The main assnmptions of onr prescription are that the halo distribntion within each pixel 
is determined by its overdensity and well described by the conditional mass fnnction [28], the 
linearized overdensity in each pixel is well approximated by the valne derived from spherical 
evolntion [69], and that the HI content of the halos is determined by the mass of the halo [21]. 
As a (non-trivial) validity check for the halo snbsampling, we confirmed that onr procednre 
reprodnces the nnconditional mass fnnction and the large scale bias to better than 10%. 
With high-resolntion simnlations of smaller volnmes, the accnracy of onr procednre conld 
be assessed in more detail and we leave snch a stndy to fntnre work. Assessing the validity 
of the HI assignment is more complicated. The model from [21] is able to reprodnce data 
of sky-averaged valnes snch as the HI density Hhi or the bias 6 hi- Dne to a lack of data 
at these redshifts, a more detailed stndy of the relation between halo and HI on an object 
by object basis wonld have to resort to hydrodynamic simnlations (see [87] for example). 
We fnrthermore showed a way for inclnding redshift space distortions in onr simnlations and 
verified that they do not affect the main conclnsions drawn from the analysis of the nndistorted 
maps in the following. 

Onr brightness temperatnre maps of the cosmic signal are snfhciently realistic to stndy 
the impact of non-linear effects on the large-scale clnstering properties of HI intensity maps. 
In onr analysis of the maps, we have focnsed on the angnlar clnstering of the brightness 
temperatnre flnctnations as measnred by the angnlar power spectrnm. We have nsed two 
estimators for the power spectrnm: the psendo Ci approach [29] and the pnblicly available 
PolSpice package [30, 31]. We have verified that the estimated angnlar power spectra of the 
simnlated intensity maps agree well with predictions from linear theory and the halo model 
on large scales. We have fonnd that the bias of HI relative to dark matter, as estimated 
from the ratio of angnlar power spectra, is close to nnity and mildly scale dependent. For 
the lowest redshift of onr simnlations (z ~ 0.366), the bias falls below its large scale valne by 
~ 10% at 200. 

Using the mnltiple N-body realizations, we have estimated the covariance of both Ci 
estimators and compared the resnlts to estimators from jackknife resampling and analytic 
predictions based on Ganssian statistics. Treating the covariance from mnltiple simnlations 
as a noisy estimate of the trne covariance, we have fonnd good agreement with the PolSpice 
covariance estimate based on Ganssian statistics. This shows that even for onr simple snrvey 
geometry (a contignons qnarter of the sky), most off-diagonal correlations are introdnced by 
the mask. Only for small scales and low redshifts, the resnlts indicate an excess of ~ 8% 
off-diagonal non-Ganssian correlations that are not captnred by the PolSpice estimate. The 
jackknife estimator overpredicted the error on the angnlar power spectrnm by approximately 
~ 40%, independently of scale, redshift and estimator. It was however able to trace 
the extra correlations introdnced by the non-Ganssian natnre of the brightness temperatnre 
flnctnations. The psendo estimator for the covariance has no off-diagonal correlations by 
constrnction and overestimated the variance by a factor of two. Looking at the variance of 
psendo Ci bandpowers, i.e. binned Ci valnes that are approximately nncorrelated, the lack of 
correlations however balanced the excess in variance and the estimate converged to the resnlts 
from the mnltiple realizations. It is worth noting that the HI prescription from [21] leads to a 
distribntion of brightness temperatnre flnctnations that is less skewed to high contrasts than 
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the matter density field and hence also more Gaussian. The lack of HI in high overdensities 
is supported by physical [25] and observational [88, 89] arguments, but changes in the HI 
prescription could nevertheless influence some of the conclusions on the covariance of the 
angular power spectrum drawn from our maps. 

Our analysis showed that our approach to simulating HI intensity maps can be used to 
simulate and study upcoming surveys. It also allows for the development of data analysis 
techniques that improve the extraction of cosmological information and the separation of 
the cosmological signal and contaminating components such as astrophysical foregrounds 
or human-made radio frequency interference. As galaxy surveys cover greater parts of the 
sky to increasing depth, cross-correlations between intensity mapping and galaxy surveys 
are of particular interest as they are less sensitive to systematic effects. In future work, such 
simulations can thus be used to additionally model the connection between halos and galaxies 
in order to study the cross-correlation signal between HI and galaxy surveys. 


Acknowledgments 

We thank Matthew Becker and Risa Wechsler for providing the cosmological simulations 
used in this work. The simulations were run using computing resources at the University of 
Chicago and NERSC. We also thank Andrina Nicola, Hamsa Padmanabhan, Oliver Hahn, 
Tirth Choudhury, and Raghunathan Srianand for insightful discussions, Francesco Montanari 
for his support with CLASS, and an anonymous referee for useful comments that helped 
improve the manuscript. Some of the results in this paper have been derived using the 
HEALPix [67] package. This work was in part supported by the Swiss National Science 
Foundation (Grant No. 200021_143906). 

References 

[1] M. Santos, P. Bull, D. Alonso, S. Camera, P. Ferreira, G. Bernard!, R. Maartens, M. Viel, 

F. Villaescusa-Navarro, F. B. Abdalla, M. Jarvis, R. B. Metcalf, A. Pourtsidou, and L. Wolz, 
Cosmology from a SKA HI intensity mapping survey, in Proceedings of Advancing Astrophysics 
with the Square Kilometre Array (AASKA14-), p. 19, 2015. arXiv; 1501.03989. 

[2] R. A. C. Croft, J. Miralda-Escude, Z. Zheng, A. Bolton, K. S. Dawson, J. B. Peterson, D. G. 
York, D. Eisenstein, J. Brinkmann, J. Brownstein, T. Delubac, A. Font-Ribera, J. C. Hamilton, 
K. G. Lee, A. Myers, N. Palanque-Delabrouille, I. Paris, P. Petitjean, M. M. Pieri, N. P. Ross, 

G. Rossi, D. J. Schlegel, D. P. Schneider, J. Vazquez, M. Viel, D. H. Weinberg, and C. Yeche, 
Large-scale clustering of Lyman-alpha emission intensity from SDSS/BOSS, ArXiv e-prints 
(Apr., 2015) [arXiv; 1504. 04088]. 

[3] T. C. Chang, U. L. Pen, K. Bandura, and J. B. Peterson, An intensity map of hydrogen 21-cm 
emission at redshift z ^ 0.8, Nature 466 (July, 2010) 463-465, [arXiv; 1007.3709]. 

[4] K. W. Masui, E. R. Switzer, N. Banavar, K. Bandura, C. Blake, L. M. Calin, T. C. Chang, 

X. Chen, Y. C. Li, Y. W. Liao, A. Natarajan, U. L. Pen, J. B. Peterson, J. R. Shaw, and T. C. 
Voytek, Measurement of 21 cm Brightness Fluctuations at z ^ 0.8 in Cross-correlation, 
Astrophys. J. 763 (Jan., 2013) L20, [arXiv; 1208.0331]. 

[5] E. R. Switzer, K. W. Masui, K. Bandura, L. M. Calin, T. C. Chang, X. L. Chen, Y. C. Li, 

Y. W. Liao, A. Natarajan, U. L. Pen, J. B. Peterson, J. R. Shaw, and T. C. Voytek, 
Determination of z ^ 0.8 neutral hydrogen fluctuations using the 21 cm intensity mapping 
autocorrelation, Mon. Not. Roy. Astron. Soc. 434 (July, 2013) L46-L50, [arXiv; 1304.3712]. 


- 21 


[6] P. Comaschi and A. Ferrara, Probing high-redshift galaxies with Lya intensity mapping, Mon. 
Not. Roy. Astron. Soc. 455 (Jan., 2016) 725-738, [arXiv; 1506.08838]. 

[7] M. Silva, M. G. Santos, A. Cooray, and Y. Gong, Prospects for Detecting C II Emission during 
the Epoch of Reionization, Astrophys. J. 806 (June, 2015) 209, [arXiv; 1410.4808]. 

]8] Y. Gong, A. Gooray, M. B. Silva, M. G. Santos, and P. Lubin, Probing Reionization with 
Intensity Mapping of Molecular and Fine-structure Lines, Astrophys. J. 728 (Feb., 2011) L46, 
]arXiv; 1101.2892]. 

]9] T. Y. Li, R. H. Wechsler, K. Devaraj, and S. E. Ghurch, Connecting CO Intensity Mapping to 
Molecular Cas and Star Formation in the Epoch of Calaxy Assembly, Astrophys. J. 817 (Feb., 
2016) 169, ]arXiv;1503.08833]. 

]10] N. Mashian, A. Sternberg, and A. Loeb, Predicting the intensity mapping signal for multi-J 
CO lines, ArXiv e-prints (July, 2015) ]arXiv; 1507. 02686]. 

]11] J. S. B. Wyithe, A. Loeb, and P. M. Geil, Baryonic acoustic oscillations in 21-cm emission: a 
probe of dark energy out to high redshifts, Mon. Not. Roy. Astron. Soc. 383 (Dec., 2007) 
1195-1209, ]arXiv; 0709. 2955]. 

]12] A. Loeb and J. S. B. Wyithe, Possibility of Precise Measurement of the Cosmological Power 
Spectrum with a Dedicated Survey of 21cm Emission after Reionization, Phys. Rev. Lett. 100 
(Apr., 2008) 161301, ]arXiv; 0801. 1677]. 

]13] E. Visbal, A. Loeb, and S. Wyithe, Cosmological constraints from 21cm surveys after 
reionization, J. Cosmol. Astropart. Phys. 10 (Oct., 2009) 30, ]arXiv; 0812. 0419]. 

]14] P. Bull, P. G. Ferreira, P. Patel, and M. G. Santos, Late-time cosmology with 21 cm intensity 
mapping experiments, Astrophys. J. 803 (Apr., 2015) 21, ]arXiv: 1405. 1452]. 

]15] T. G. Ghang, Y. Gong, M. Santos, M. B. Silva, J. Aguirre, O. Dore, and J. Pritchard, Synergy 
of CO/CII/Lya Line Intensity Mapping with the SKA, in Proceedings of Advancing 
Astrophysics with the Square Kilometre Array (AASKAI 4 ), p. 4, 2015. arXiv; 1501.04654. 

]16] J. G. Pober, A. R. Parsons, D. R. DeBoer, P. McDonald, M. McQuinn, J. E. Aguirre, Z. Ali, 

R. F. Bradley, T. G. Ghang, and M. F. Morales, The Baryon Acoustic Oscillation Broadband 
and Broad-beam Array: Design Overview and Sensitivity Forecasts, Astronom. J. 145 (Mar., 
2013) 65, ]arXiv; 1210.2413]. 

]17] R. Ansari, J. E. Gampagne, P. Golom, G. Magneville, J. M. Martin, M. Moniez, J. Rich, and 
G. Yeche, BAORadio: A digital pipeline for radio interferometry and 21 cm mapping of large 
scale structures, C. R. Phys. 13 (Jan., 2012) 46-53, ]arXiv; 1209.3266]. 

]18] R. A. Battye, 1. W. A. Browne, G. Dickinson, G. Heron, B. Mallei, and A. Pourtsidou, H i 
intensity mapping: a single dish approach, Mon. Not. Roy. Astron. Soc. 434 (Aug., 2013) 
1239-1256, ]arXiv;1209.0343]. 

]19] K. Bandura, G. E. Addison, M. Amiri, J. R. Bond, D. Gampbell-Wilson, L. Gonnor, J. F. 
Gliche, G. Davis, M. Deng, N. Denman, M. Dobbs, M. Fandino, K. Gibbs, A. Gilbert, 

M. Halpern, D. Hanna, A. D. Hincks, G. Hinshaw, G. Holer, P. Klages, T. L. Landecker, 

K. Masui, J. Mena P, L. B. Newburgh, U. L. Pen, J. B. Peterson, A. Recnik, J. R. Shaw, 

K. Sigurdson, M. Sitwell, G. Smecher, R. Smegal, K. Vanderlinde, and D. Wiebe, Canadian 
Hydrogen Intensity Mapping Experiment (CHIME) pathfinder, in SPIE Astronomical 
Telescopes -h Instrumentation (L. M. Stepp, R. Gilmozzi, and H. J. Hall, eds.), p. 22, SPIE, 
July, 2014. arXiv; 1406.2288. 

]20] A. Pontzen, Scale-dependent bias in the baryonic-acoustic-oscillation-scale intergalactic neutral 
hydrogen, Phys. Rev. D 89 (Apr., 2014) 083010, ]arXiv; 1402.0506]. 

]21] H. Padmanabhan, T. R. Ghoudhury, and A. Refregier, Modelling the cosmic neutral hydrogen 
from DLAs and 21 cm observations, ArXiv e-prints (Apr., 2015) ]arXiv; 1505.00008]. 


- 22 - 


[22] D. Alonso, P. G. Ferreira, and M. G. Santos, Fast simulations for intensity mapping 
experiments, Mon. Not. Roy. Astron. Soc. 444 (Nov., 2014) 3183-3197, [arXiv; 1405. 1751]. 

[23] A. R. Duffy, S. T. Kay, R. A. Battye, G. M. Booth, G. Dalla Vecchia, and J. Schaye, Modelling 
neutral hydrogen in galaxies using cosmological hydrodynamical simulations, Mon. Not. Roy. 
Astron. Soc. (Feb., 2012) [arXiv; 1107.3720]. 

[24] R. Dave, N. Katz, B. D. Oppenheimer, J. A. Kollmeier, and D. H. Weinberg, The neutral 
hydrogen content of galaxies in cosmological hydrodynamic simulations, Mon. Not. Roy. Astron. 
Soc. 434 (Aug., 2013) 2645-2663, [arXiv: 1302. 3631]. 

[25] J. S. Bagla, N. Khandai, and K. K. Datta, HdACi as a prohe of the large-scale structure in the 
post-reionization universe, Mon. Not. Roy. Astron. Soc. 407 (June, 2010) 567-580, 

[arXiv :0908. 3796] . 

[26] N. Khandai, S. K. Sethi, T. Di Matteo, R. A. G. Groft, V. Springel, A. Jana, and J. P. 
Gardner, Detecting neutral hydrogen in emission at redshift z ~ 1, Mon. Not. Roy. Astron. Soc. 
415 (May, 2011) 2580-2593, [arXiv: 1012. 1880]. 

[27] Wechsler, R H et al.. In preparation, 2015. 

[28] R. K. Sheth and G. Tormen, An excursion set model of hierarchical clustering: ellipsoidal 
collapse and the moving barrier, Mon. Not. Roy. Astron. Soc. 329 (Jan., 2002) 61-75, 

[astro -ph/01051 13]. 

[29] P. J. E. Peebles, Statistical Analysis of Catalogs of Extragalactic Objects. I. Theory, Astrophys. 
J. 185 (Oct., 1973) 413-440. 

[30] I. Szapudi, S. Prunet, and S. Golombi, Fast Analysis of Inhomogenous Megapixel Cosmic 
Microwave Background Maps, Astrophys. J. 561 (Nov., 2001) L11-L14, ]astro-ph/0107383]. 

[31] G. Ghon, A. Ghallinor, S. Prunet, E. Hivon, and I. Szapudi, Fast estimation of polarization 
power spectra using correlation functions, Mon. Not. Roy. Astron. Soc. 350 (May, 2004) 
914-926, ]astro-ph/0303414]. 

[32] Thomas, S A and Abdalla, F B and Lahav, O, The angular power spectra of photometric sloan 
digital sky survey luminous red galaxies, Mon. Not. Roy. Astron. Soc. 412 (2011), no. 3 
1669-1685, [arXiv:1011.2448]. 

[33] DES collaboration, M. R. Becker, et al.. Cosmic Shear Measurements with DES Science 
Verification Data, arXiv: 1507.05598. 

[34] J. R. Pritchard and A. Loeb, 21 cm cosmology in the 21st century. Reports on Progress in 
Physics 75 (Aug., 2012) 086901, [corXiv: 1109. 6012]. 

[35] S. M. Rao and D. A. Turnshek, Discovery of DampedLya Systems at Redshifts Less than 1.65 
and Results on Their Incidence and Cosmological Mass Density, Astrophys. J. Suppl. S. 130 
(Sept., 2000) 1-35, ]astro-ph/9909164]. 

[36] S. M. Rao, D. A. Turnshek, and D. B. Nestor, Damped Lya Systems at z < 1.65.' The 
Expanded Sloan Digital Sky Survey Hubble Space Telescope Sample, Astrophys. J. 636 (Jan., 
2006) 610-630, ]astro-ph/0509469]. 

[37] P. Lah, J. N. Ghengalur, F. H. Briggs, M. Golless, R. De Propris, M. B. Pracy, W. J. G. 

De Blok, S. S. Fujita, M. Ajiki, Y. Shioya, T. Nagao, T. Murayama, Y. Taniguchi, M. Yagi, 
and S. Okamura, The H I content of star-forming galaxies at z= 0.24, Mon. Not. Roy. Astron. 
Soc. 376 (Apr., 2007) 1357-1366, ]astro-ph/0701668]. 

[38] P. Lah, M. B. Pracy, J. N. Ghengalur, F. H. Briggs, M. Golless, R. De Propris, S. Ferris, B. P. 
Schmidt, and B. E. Tucker, The H I gas content of galaxies around Abell 370, a galaxy cluster 
at z = 0.37, Mon. Not. Roy. Astron. Soc. 399 (Nov., 2009) 1447-1470, [arXiv: 0907. 1416]. 


- 23 


[39] A. M. Martin, E. Papastergis, R. Giovanelli, M. P. Haynes, C. M. Springob, and S. Stierwalt, 
The Arecibo Legacy Fast ALFA Survey. X. The FI I Mass Function and LIhi from the 40% 
ALFALFA Survey, Astrophys. J. L. 723 (Oct., 2010) 1359-1374, [arXiv: 1008.5107]. 

[40] M. A. Zwaan, M. J. Meyer, L. Staveley-Smith, and R. L. Webster, The FLIPASS catalogue: 
and environmental effects on the HI mass function of galaxies, Mon. Not. Roy. Astron. Soc. 
359 (May, 2005) L30-L34, ]astro-ph/0502257]. 

[41] H. Padmanabhan, T. R. Choudhury, and A. Refregier, Theoretical and observational 
constraints on the hi intensity power spectrum, Mon. Not. Roy. Astron. Soc. 447 (Mar., 2015) 
3745-3755, [arXiv;1407.6366]. 

[42] A. Font-Ribera, J. Miralda-Escude, E. Arnau, B. Carithers, K. G. Lee, P. Noterdaeme, I. Paris, 
P. Petitjean, J. Rich, E. Rollinde, N. P. Ross, D. P. Schneider, M. White, and D. G. York, The 
large-scale cross-correlation of Damped Lyman alpha systems with the Lyman alpha forest: first 
measurements from BOSS, J. Cosmol. Astropart. Phys. 11 (Nov., 2012) 59, [arXiv; 1209.4596]. 

[43] R. Ansari, J. E. Gampagne, P. Golom, J. M. Le Goff, G. Magneville, J. M. Martin, M. Moniez, 
J. Rich, and G. Yeche, 21 cm observation of large-scale structures at ~ 1. instrument 
sensitivity and foreground subtraction, Astronom. and Astrophys. 540 (Apr., 2012) A129, 
[arXiv; 1108.1474]. 

[44] L. Wolz, F. B. Abdalla, G. Blake, J. R. Shaw, E. Ghapman, and S. Rawlings, The effect of 
foreground subtraction on cosmological measurements from intensity mapping, Mon. Not. Roy. 
Astron. Soc. 441 (May, 2014) 3271-3283, [arXiv; 1310.8144]. 

[45] D. Alonso, P. Bull, P. G. Ferreira, and M. G. Santos, Blind foreground subtraction for intensity 
mapping experiments, Mon. Not. Roy. Astron. Soc. 447 (Feb., 2015) 400-416, 

[arXiv; 1409.8667]. 

[46] M. A. Bigot-Sazy, G. Dickinson, R. A. Battye, I. W. A. Browne, Y. Z. Ma, B. Maffei, 

F. Noviello, M. Remazeilles, and P. N. Wilkinson, Simulations for single-dish intensity mapping 
experiments, ArXiv e-prints (July, 2015) [arXiv; 1507.04561]. 

[47] J. Asorey, M. Grocce, E. Gaztanaga, and A. Lewis, Recovering 3D clustering information with 
angular correlations, Mon. Not. Roy. Astron. Soc. 427 (Nov., 2012) 1891-1902, 

[arXiv; 1207.6487]. 

[48] A. Nicola, A. Refregier, A. Amara, and A. Paranjape, Three-dimensional spherical analyses of 
cosmological spectroscopic surveys, Phys. Rev. D 90 (Sept., 2014) 063515, [curXiv; 1405.3660]. 

[49] F. Lanusse, A. Rassat, and J. L. Starck, 3D galaxy clustering with future wide-field surveys: 
Advantages of a spherical Fourier-Bessel analysis, Astronom. and Astrophys. 578 (May, 2015) 
AlO, [arXiv; 1406.5989]. 

[50] M. Tegmark and A. de Oliveira-Gosta, How to measure CMB power spectra without losing 
information, Phys. Rev. D 64 (Sept., 2001) 063001, ]astro-ph/0012120]. 

[51] J. R. Bond, A. H. Jaffe, and L. Knox, Estimating the Power Spectrum of the Cosmic 
Microwave Background, Phys. Rev. D 57 (Feb., 1998) 2117-2137, ]astro-ph/9708203]. 

[52] 1. Szapudi, S. Prunet, D. Pogosyan, A. S. Szalay, and J. R. Bond, Fast Cosmic Microwave 
Background Analyses via Correlation Functions, Astrophys. J. 548 (Feb., 2001) L115-L118, 
[astro-ph/0010256]. 

[53] E. Hivon, K. M. Gorski, G. B. Netterfield, B. P. Grill, S. Prunet, and F. Hansen, MASTER of 
the Cosmic Microwave Background Anisotropy Power Spectrum: A Fast Method for Statistical 
Analysis of Large and Complex Cosmic Microwave Background Data Sets, Astrophys. J. 567 
(Mar., 2002) 2-17, ]astro-ph/0105302]. 

[54] G. Efstathiou, Myths and truths concerning estimation of power spectra: the case for a hybrid 
estimator, Mon. Not. Roy. Astron. Soc. 349 (Apr., 2004) 603-626, ]astro-ph/0307515]. 


- 24 - 


[55] C. Bonvin and R. Durrer, What galaxy surveys really measure, Phys. Rev. D 84 (Sept., 2011) 
[arXiv; 1105.5280], 

[56] A. Challinor and A. Lewis, Linear power spectrum of observed source number counts, Phys. 

Rev. D 84 (Aug., 2011) 043516, [arXiv: 1105. 5292]. 

[57] M. Eriksen and E. Gaztanaga, Combining spectroscopic and photometric surveys using angular 
cross-correlations - I. Algorithm and modelling, Mon. Not. Roy. Astron. Soc. 452 (July, 2015) 
2149-2167, [arXiv; 1412. 2208]. 

[58] M. Tegmark and G. Efstathiou, A method for subtracting foregrounds from multifrequency CMB 
sky maps, Mon. Not. Roy. Astron. Soc. 281 (Aug., 1996) 1297-1314, ]astro-ph/9507009]. 

[59] V. Springel, N. Yoshida, and S. D. M. White, GADGET: a code for collisionless and 
gasdynamical cosmological simulations. New Astron. 6 (Apr., 2001) 79-117, 

[astro-ph/0003162]. 

[60] V. Springel, The cosmological simulation code GADGET-2, Mon. Not. Roy. Astron. Soc. 364 
(Dec., 2005) 1105-1134, ]astro-ph/0505010]. 

[61] M. T. Busha, R. H. Wechsler, M. R. Becker, B. Erickson, and A. E. Evrard, Catalog Production 
for the DES Blind Cosmology Challenge, in American Astronomical Society Meeting Abstracts, 
vol. 221 of American Astronomical Society Meeting Abstracts, p. 341.07, Jan., 2013. 

[62] B. Erickson, A. E. Evrard, R. Singh, S. Marru, M. Pierce, M. R. Becker, A. Kravtsov, M. T. 
Busha, R. H. Wechsler, P. M. Ricker, and des Simulations Working Group, A High Throughput 
Workflow Environment for Cosmological Simulations, in American Astronomical Society 
Meeting Abstracts, vol. 221 of American Astronomical Society Meeting Abstracts, p. 352.21, 
Jan., 2013. 

[63] DES collaboration, V. Vikram, et ah, Wide-field tensing mass maps from Dark Energy Survey 
science verification data: Methodology and detailed analysis, Phys. Rev. D 92 (July, 2015) 
022006, [arXiv; 1504. 03002]. 

[64] DES collaboration, G. Bonnett, et ah, Redshift distributions of galaxies in the DES Science 
Verification shear catalogue and implications for weak tensing, arXiv; 1507.05909. 

[65] DES collaboration, B. Leistedt, et ah. Mapping and simulating systematics due to 
spatially-varying observing conditions in DES Science Verification data, arXiv; 1507.05647. 

[66] DES collaboration, Y. Park, et ah. Joint Analysis of Galaxy-Galaxy Tensing and Galaxy 
Clustering: Methodology and Forecasts for DES, arXiv; 1507.05353. 

[67] K. M. Gorski, E. Hivon, A. J. Banday, B. D. Wandelt, F. K. Hansen, M. Reinecke, and 

M. Bartelmann, Healpix: A framework for high-resolution discretization and fast analysis of 
data distributed on the sphere, Astrophys. J. 622 (Apr., 2005) 759-771, ]astro-ph/0409513]. 

[68] S. de la Torre and J. A. Peacock, Reconstructing the distribution of haloes and mock galaxies 
below the resolution limit in cosmological simulations, Mon. Not. Roy. Astron. Soc. 435 (Oct., 
2013) 743-748, [arXiv;1212.3615]. 

[69] F. Bernardeau, The nonlinear evolution of rare events, Astrophys. J. 427 (May, 1994) 51-71, 
[astro -ph/931 1066] . 

[70] J. R. Bond, S. Gole, G. Efstathiou, and N. Kaiser, Excursion set mass functions for 
hierarchical Gaussian fluctuations, Astrophys. J. 379 (Oct., 1991) 440-460. 

[71] R. K. Sheth and G. Lemson, Biasing and the distribution of dark matter haloes, Mon. Not. 

Roy. Astron. Soc. 304 (Apr., 1999) 767-792, ]astro-ph/9808138]. 

[72] A. Raccanelli, P. Bull, S. Gamera, G. Blake, P. Ferreira, R. Maartens, M. G. Santos, D. Bacon, 
O. Dore, M. Viel, and G. B. Zhao, Measuring redshift-space distortion with future SKA surveys. 


- 25 


Advancing Astrophysics with the Square Kilometre Array (AASKA14-) (2015) 31, 

[arXiv; 1501.03821], 

[73] D. S. Reed, R. Bower, C. S. Frenk, A. Jenkins, and T. Theuns, The halo mass function from 
the dark ages through the present day, Mon. Not. Roy. Astron. Soc. 374 (Jan., 2007) 2-15, 
[astro-ph/0607150] . 

[74] M. Crocce, P. Fosalba, F. J. Castander, and E. Gaztanaga, Simulating the Universe with 
MICE: the abundance of massive clusters, Mon. Not. Roy. Astron. Soc. 403 (Apr., 2010) 
1353-1367, [arXiv;0907.0019]. 

[75] W. A. Watson, I. T. Iliev, A. D’Aloisio, A. Knebe, P. R. Shapiro, and G. Yepes, The halo mass 
function through the cosmic ages, Mon. Not. Roy. Astron. Soc. 433 (Aug., 2013) 1230-1245, 
[arXiv; 1212.0095]. 

[76] N. Khandai, T. Di Matteo, R. Groft, S. Wilkins, Y. Feng, E. Tucker, G. DeGraf, and M.-S. Liu, 
The MassiveBlack-II simulation: the evolution of haloes and galaxies to zO, Mon. Not. Roy. 
Astron. Soc. 450 (June, 2015) 1349-1374, [arXiv; 1402.0888]. 

[77] W. A. Hellwing, G. S. Frenk, M. Gautun, S. Bose, J. Helly, A. Jenkins, T. Sawala, and 

M. Gytowski, The Copernicus Complexio: a high-resolution view of the small-scale Universe, 

arXiv;1505.06436. 

[78] M. Manera and E. Gaztanaga, The local bias model in the large-scale halo distribution, Mon. 
Not. Roy. Astron. Soc. 415 (July, 2011) 383-398, ]cLrXiv;0912.0446]. 

[79] D. Bias, J. Lesgourgues, and T. Tram, The Cosmic Linear Anisotropy Solving System 
(CLASS). Part II: Approximation schemes, J. Cosmol. Astropart. Phys. 2011 (July, 2011) 
034-034, [arXiv; 1104.2933]. 

[80] E. Di Dio, F. Montanari, J. Lesgourgues, and R. Durrer, The CLASSgal code for relativistic 
cosmological large scale structure, J. Cosmol. Astropart. Phys. 2013 (Nov., 2013) 044-044, 
[arXiv; 1307.1459]. 

[81] R. E. Smith, J. A. Peacock, and A. Jenkins, Stable clustering, the halo model and non-linear 
cosmological power spectra, Mon. Not. Roy. Astron. Soc. 341 (June, 2003) 1311-1332, 

[astro-ph/0207664]. 

[82] R. Takahashi, M. Sato, T. Nishimichi, A. Taruya, and M. Oguri, Revising the Halofit Model for 
the Nonlinear Matter Power Spectrum, Astrophys. J. 761 (Dec., 2012) 152, [arXiv; 1208.2701]. 

[83] N. Kaiser, Clustering in real space and in redshift space, Mon. Not. Roy. Astron. Soc. 227 
(July, 1987) 1-21. 

[84] N. Padmanabhan, D. J. Schlegel, U. Seljak, A. Makarov, N. A. Bahcall, M. R. Blanton, 

J. Brinkmann, D. J. Eisenstein, D. P. Finkbeiner, J. E. Gunn, D. W. Hogg, Z. Ivezic, G. R. 
Knapp, J. Loveday, R. H. Lupton, R. G. Nichol, D. P. Schneider, M. A. Strauss, M. Tegmark, 
and D. G. York, The clustering of luminous red galaxies in the Sloan Digital Sky Survey 
imaging data, Mon. Not. Roy. Astron. Soc. 378 (July, 2007) 852-872, [astro-ph/0605302[. 

[85] L. Knox, Determination of inflationary observables by cosmic microwave background anisotropy 
experiments, Phys. Rev. D 52 (Oct., 1995) 4307-4318, [astro-ph/9504054[. 

[86] A. Gabre, P. Fosalba, E. Gaztanaga, and M. Manera, Error analysis in cross-correlation of sky 
maps: application to the Integrated Sachs- Wolfe detection, Mon. Not. Roy. Astron. Soc. 381 
(Nov., 2007) 1347-1368, [astro-ph/0701393[. 

[87] F. Villaescusa-Navarro, S. Planelles, S. Borgani, M. Viel, E. Rasia, G. Murante, K. Dolag, 

L. K. Steinborn, V. Biffi, A. M. Beck, and G. Ragone-Figueroa, Neutral hydrogen in galaxy 
clusters: impact of ACN feedback and implications for intensity mapping, Mon. Not. Roy. 
Astron. Soc. 456 (Mar., 2016) 3553-3570, [arXiv; 1510.04277]. 


- 26 - 


[88] M. P. Haynes, R. Giovanelli, A. M. Martin, K. M. Hess, A. Saintonge, E. A. K. Adams, 

G. Hallenbeck, G. L. Hoffman, S. Huang, B. R. Kent, R. A. Koopmann, E. Papastergis, 

S. Stierwalt, T. J. Balonek, D. W. Graig, S. J. U. Higdon, D. A. Kornreich, J. R. Miller, A. A. 
O’Donoghue, R. P. Olowin, J. L. Rosenberg, K. Spekkens, P. Troischt, and E. M. Wilcots, The 
Areciho Legacy Fast ALFA Survey: The a.40 F[ L Source Catalog, Lts Characteristics and Their 
Impact on the Derivation of the H I Mass Function, Astron. J. 142 (Nov., 2011) 170, 

[arXiv; 1109.0027], 

[89] E. Papastergis, R. Giovanelli, M. P. Haynes, A. Rodn'guez-Puebla, and M. G. Jones, The 
Clustering of ALFALFA Calaxies: Dependence on H I Mass, Relationship with Optical 
Samples, and Clues of Host Halo Properties, Astrophys. J. 776 (Oct., 2013) 43, 

[arXiv; 1308.2661]. 

[90] P. S. Behroozi, R. H. Wechsler, and H.-Y. Wu, The ROCKSTAR Phase-space Temporal Halo 
Finder and the Velocity Offsets of Cluster Cores, Astrophys. J. 762 (Jan., 2013) 109, 

[arXiv; 1110.4372]. 

[91] V. Desjacques and R. K. Sheth, Redshift space correlations and scale-dependent stochastic 
biasing of density peaks, Phys. Rev. H 81 (Jan., 2010) 023526, [arXiv: 0909.4544]. 


A Sub-grid halo sampling 

In this Appendix, we compare our prescription for sampling halos from the density field with 
the resolved halos in the simulation. Our sub-grid model involves the linearization of the non¬ 
linear density field [69] and the sampling of halos from the conditional mass function from [28] . 
In the following, we first analyze the linearization before proceeding to a comparison between 
the resolved and sub-sampled halo populations. 

A.l Linearization 

Figure 10 shows the distribution of overdensities in the dark matter maps before and after 
linearization for the lowest and highest redshift bin. We expect the linearized overdensities to 
follow a Gaussian distribution of mean zero, but the variance of the Gaussian depends on the 
filtering of the field by our pixels and is hence hard to predict from the linear power spectrum. 
We therefore also plot a Gaussian with the same mean and variance as the distribution of 
linearized over densities. We can see in Figure 10 that the agreement is reasonable, but the 
distribution of overdensities is shifted (means of —0.27 and —0.09 for the low and high redshift 
bin) and skewed (skewness of —0.14 and —0.10 for the low and high redshift bin) towards 
negative overdensities in particular for the lowest redshift bins. This means that equation (3.4) 
fails at correctly reproducing the high overdensity tail of the linearized field for our pixels. 

Figure 11 shows the angular power spectrum of the non-linear and linearized field as 
compared to the respective predictions from linear theory and halo fit corrections. We can see 
that, overall, the correlations of the linearized field follow the linear theory prediction. The 
bottom panel shows, however, that the linearized field has deviations of around 15% when 
compared to its prediction while the non-linear field agrees with the non-linear prediction to 
better than 5%. 

A.2 Comparison of halo populations 

In this section, we compare the resolved halos in the simulation to the halos sampled from 
the density field of the same simulation with our procedure. We used rockstar [90] for finding 
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z = 0.369 z = 0.863 



Figure 10. Histogram of non-linear (blue) and linearized (green) matter overdensities in our maps 
at the lowest (z ~ 0.369, left) and highest (z ~ 0.863, right) redshift. The non-linear overdensities 
have a tail towards high overdensities and are, by definition, cut off at an underdensity of -1. The 
distribution of the linearized overdensities is overplotted with a Gaussian distribution of the same 
mean and variance (black line). We can see that the linearized overdensities approximately follow 
the Gaussian distribution, but are shifted and skewed to negative values in particular for the lowest 
redshift bin. 



Figure 11. Angular power spectrum of non-linear (blue) and linearized (green) matter overdensities. 
The spectrum of the linearized field and is supposed to follow the linear theory power spectrum (black 
dashed line). We can see that, in particular for the lowest redshift bin, the estimated and predicted 
linear power spectra deviate by around 15%. 


halos in the lightcones of the simulation and use the mass M 2 oo,c within the region where the 
density is 200 times higher than the critical density as halo mass. 

Figure 12 shows the agreement between the number of halos within a given mass range 
as observed from the population of resolved and sampled halos as well as the distribution 
predicted from the unconditional mass function from reference [28]. The simulations have a 
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Figure 12. Comparison of the number of halos within a given mass bin from the simulation (blue 
histograms) and the sampled halos (red histogram) in the lowest (left) and highest (right) redshift 
bin. The green line shows the prediction from the unconditional mass function [28]. We can see that 
the halo catalog from the simulation is complete for halos with mass M 2 oo,c > lO^^h“^M 0 . The 
distribution of the sampled halos agrees with the resolved halos for masses close to this limit, but 
deviates for the highest masses in particular for the low redshift bin. 

mass resolution of 1.6 x 10 ^^ h“^ Mq and the distribution of resolved halos is therefore not 
complete for masses below approximately 10^^h“^ -^©- The exponential cut-off scale in the HI 
assignment, however, is at roughly 10 ^^h“^ Mq so in our model the resolved halos would not 
contribute significantly to the overall HI distribution. Close to -TIq, the resolved and 

sampled halo populations indeed agree to about 10%. The high mass tail of the sampled halo 
distribution at low redshifts, however, diverges from the resolved population. This is most 
likely due to the shortcomings of the linearization process at the high overdensities discussed 
in section A.l. As these halos do not contain significant amounts of HI in our prescription, 
this mismatch is however not critical for our application. At the highest redshift bin our 
method seems to extend even to this high mass end. 

In Figure 13 we show the distribution of pixel masses in the matter density and the halo 
field at the highest redshift. The halos are either the resolved halos from the simulations (filled 
contours) or the sampled halos (lines). The plot shows that both the distribution and the 
scatter in the halo mass distribution of the sampled halos is very similar to the distribution 
of the resolved halos. There are mild differences in the scatter that are most likely related to 
the simple Poisson sampling scheme. For the same reason, some of the pixels slightly violate 
mass conservation. None of these shortcomings are expected to affect the halos within the 
mass range relevant for our intensity maps. 

A.3 Peculiar velocities 

When assigning the halos to pixels, one has to take into account that, depending on the 
line-of-sight velocities of the halos and the position of the halos within the radial extension 
of the bin, the 21 cm line in some of the halos will shift into the neighboring frequency bins. 
The bin-width of our maps corresponds to a frequency bandwidth of A/ = 25 MHz. For a 
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Figure 13. Comparison of mass distributions in pixels for the density field and the halos in the 
highest redshift bin. We only consider halos with mass greater than Mq. The filled contours 

show the distribution of the resolved halos and the line contours show the distributions of the sampled 
halos. Note that the plot is not showing the pixels which contain no halos. 


halo at the center of the bin to escape the bin, it would thus require velocities larger than 
3 X 10^kms“^ which are rarely achieved. At the boundaries of the bin, however, the redshift 
space distortions (RSDs) due to peculiar velocities are distorting the shape of this boundary. 

To model the effect of RSDs on our maps, we start, as before, with dark matter maps for 
which we assign each particle of the N-body simulation within the lightcone to the bin which 
corresponds to its position in redshift according to its comoving distance. In addition to the 
number rnj of particles within pixel i of redshift bin j, we now additionally count the number 
nij{k) of particles within this volume that fall into redshift bin k when furthermore taking 
their line of sight velocity into account. The number of particles within pixel i of redshift 
bin k of the distorted dark matter maps (including RSDs) is then given by hij{k). 

Only a small fraction of the particles gets redistributed by this procedure, but due to the 
coherence of the effect on large scales, even these mild changes enhance the large scale power 
spectrum of the field [83] . We created the distorted dark matter maps from four independent 
simulations and Figure 5 shows a comparison of the angular power spectra as estimated 
from those maps with (thin green line) and without (thick green line) RSDs. The top panel 
of Figure 14 furthermore shows that the enhancement of the large scale correlations in the 
distorted maps is as expected from the theoretical prediction by CLASS (see [80] for details 
on RSDs in CLASS). 

Our approach for assigning halos to pixels is based on the insight that the number of 
halos within each pixel can be modeled using the conditional mass function of reference [28]. 
The procedure however does not model the positions of the halos within the pixels or their 
velocities. Assuming that the halo velocities are unbiased with respect to the dark matter flow 
(see for example [91] for a discussion), we can however use the information about the distorted 
dark matter particle distribution hij{k) in order to imprint the RSDs on the HI intensity maps. 
Starting from the unperturbed dark matter density field, we follow the procedure described in 
section 3 to assign an HI mass to pixel i of redshift bin j. Instead of adding the complete 
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Figure 14. Comparison of the angular power spectra of the redshift space distorted maps with 
theoretical predictions with (blue) and without (green) redshift space distortions for the lowest (left) 
and highest (right) redshift bins. The dashed black line shows the ratio of both theoretical predictions. 
The top panels show the results for the matter density field and the middle panels show the results for 
the intensity maps. The prediction for the intensity maps assume a scale independent bias that does 
not evolve within the redshift bins and is derived from the unconditional mass function in [28]. The 
bottom panel shows the ratio of the spectrum estimated from the intensity maps and the spectrum 
predicted for dark matter, i.e. it shows the biasing of the intensity maps with respect to the matter 
maps. The solid line shows the predicted large scale bias and the dashed line shows the ratio between 
the theoretical prediction for the intensity maps and the matter maps. 


HI mass to this pixel, we however distribute the mass over all redshifts according to the 
fraction hij{k)/nij of dark matter particles that got redistributed due to their line of sight 
velocity. In pixel i of redshift bin k, the distorted HI maps consequently contain a mass of 

ik)/nij . (A. 1) 

j 

Figure 5 shows the angular power spectrum of the intensity maps with (thin blue line) 
and without (thick blue line) RSDs. As expected, the power on large scales is enhanced. In 
the middle panel of Figure 14 it can be seen that the power spectrum follows the theoretical 
prediction for a biased tracer with unbiased velocities. The bottom panel furthermore shows 
the agreement of the large scale bias with the analytical predictions. 

Using the intensity maps from four realizations of the N-body simulation, we furthermore 
studied if the RSDs affect the agreement between the Gaussian estimate and the estimate from 
the independent realizations for variance and covariance of the HI angular power spectra. Fig¬ 
ure 15 shows the results for the variance \ax{Ci) and averaged correlations (Corr(C'£, 
between small scales (600 < i < 700, see also section 5.2 for more discussion of the averaged 
correlation). The left plot shows that, within the noise of the estimators, the variance follows 
the Gaussian prediction from PolSpice for both the simulations with and without RSDs. We 
verified that the same is true for the correlation matrix at high redshifts or large scales. As 
for the simulations without RSDs, we find an excess of correlations between neighboring i 
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Figure 15. Variance (left) and correlations (right) of the angular power spectra of the dark matter 
field with (green) and without (blue) RSDs as estimated from our analysis of four realizations of the 
simulations. The corresponding estimates for Gaussian random fields from PolSpice are shown in 
black. The bottom panel of the variance plot shows the ratio of the estimate from the simulations 
to the PolSpice predictions. To reduce the noise in the correlation matrix estimate, we show the 
correlations between neighboring £ values averaged over 600 < £ < 700. 


scales for high £ values (600 < £ < 700). The right plot of Figure 15 however shows that, 
again within the noise of the estimate, the magnitude of the excess is broadly consistent with 
the estimate from the maps without RSDs. 

B Stochastic field models 

In this section we analyze the properties of pseudo Ci and PolSpice estimators when applied 
to masked random helds. We will compare the results from many realizations of maps with 
the same power spectrum to the results from both Gaussian held predictions and resampling 
methods. The mask is chosen such that it matches the mask which we also impose on our 
simulated intensity maps. We will use two types of random helds: Gaussian helds and log¬ 
normal helds. 

B.l Gaussian fields 

As input power spectrum, we use the angular power spectrum for redshift z ~ 0.863 with 
a bin width of Az ~ 0.06 as calculated from linear theory with haloht corrections using 
GLASS [79, 80]. The highest redshift map is closest to a Gaussian random held and closest to 
the linear theory prediction, so this analysis is used as a mock for the high redshift behavior 
of the covariance. We rescale each density held to an intensity map using equation (3.1). 
To estimate the variance and covariance of the angular power spectrum estimator, we apply 
jackknife and the predictions for Gaussian helds to 100 random realizations of the held. We 
compare these results to the values inferred from 10^ random realizations from the HEALPix 
routine synfast. 

Gonsidering the diagonal of the covariance matrix = Gov(C'£, C^), the left panel of 

Figure 16 shows a comparison of estimates from Gaussian helds together with the results 
from the 10^ realizations of a Gaussian random held with the same input power spectrum. 
The pseudo Ci estimator overestimates a{C() by a factor of ~ 2, while the prediction from 
PolSpice agrees well with the result from the 10“^ realizations. The pseudo Ci estimate for the 
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Figure 16. Standard deviation a{C() of the angnlar power spectrum from multiple realizations of the 
masked Gaussian field (black) as compared to the pseudo Ci (blue) and PolSpice (green) estimators for 
Gaussian fields (left panel) and to different jackknife configurations (right panel). While the PolSpice 
estimate is consistent with the result from 10'^ realizations, the pseudo Ct estimate overestimates 
(j{Ci) by a factor of ^ 2 to compensate for the lack of correlations. As demonstrated in section 5.2 
for the simulated intensity maps, this mismatch can be reduced by considering bandpowers that are 
binned in The jackknife error estimates become more and more biased to higher values as njack 
increases, while the variance of the error estimate decreases. 


error is getting closer to the correct value if we consider the variance of bandpowers that are 
averaged over bins in This however comes at the cost of loosing more and more information 
on the spectrum itself due to the averaging. 

The right panel of Figure 16 shows the results for the jackknife variance estimates. We 
use the PolSpice estimator for estimating the angular power spectrum of the maps, but the 
results are similar when using pseudo Ci values. To create the jackknife sample, we mask 
patches that correspond to pixels in coarser HEALPix maps. Each color in the right panel 
of Figure 16 refers to a different patch size. The number of jackknives njack given in the 
legend is inversely proportional to the size of the patch. We can see that the error estimate 
with the smallest bias is coming from the jackknife with largest patch size. The higher we 
go in njack—he. the lower in patch size—the smaller is the noise on the estimate but the 
larger is the disagreement between the jackknife estimate and the repeated simulations. To 
compromise between variance and bias of the jackknife estimate, we chose njack = 48 as our 
jackknife conhguration for the remaining analysis. For a jackknife with njack = 48, is 

biased to higher values by about 40%. 


B.2 Log-normal fields 

We want to use a log-normal held with correlations that are close to the prediction from the 
intensity map of the lowest redshift slice {z ~ 0.366 ± 0.026) in order to approximate its 
non-Gaussian nature. As we cannot easily simulate a log-normal held with specihed power 
spectrum, we start from a Gaussian held created from an input power spectrum given by the 
linear theory prediction at this redshift. When exponentiating a Gaussian held X with mean 
fj, = 0 and variance cr^, the resulting held is log-normally distributed with mean exp(iT^/2) and 
variance (exp((T^) — 1) exp(cr^). The following held Y can hence be shown to be log-normally 
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Figure 17. Standard deviation fj{Ci) of the angular power spectrum from multiple realizations of 
the masked log-normal field as compared to predictions from Gaussian field statistics for pseudo Ci 
and PolSpice estimators. While the pseudo Ci is again biased to higher a{Ct) by a factor of 2, the 
PolSpice estimator only shows a mild deviation of a view percent at small scales (inset in the top 
right corner). 
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Figure 18. Correlations of the angular power spectrum from multiple realizations of the masked 
log-normal field as compared to estimates from jackknife and Gaussian field statistics for pseudo Cgs 
(left) and PolSpice (right) for large scales [i = 25, top) and small scales [t = 650, bottom). 


distributed with variance (t|, and mean equal to T: 
Y = T exp 


Vlog((cTT/r)2 + l)^ _ 1 ^ 

(7 Z I 


(B.l) 


We use the map as given by Y as an approximation for the desired intensity map, but need 
to be aware that its angular power spectrum is not necessarily consistent with the input. 






























0<^<100 


600 <^<700 



Figure 19. Mean correlation (Corr(C^, for the PolSpice angnlar power spectrum of the 

masked log-normal field, averaged over 100 is. We show estimates from the multiple realizations, the 
jackknife, and the predictions for Gaussian fields. Correlations between large scales {0 < i < 100) are 
shown on the left, small scales (600 < £ < 700) on the right. 


As shown by Figure 17, the variances as predicted using Gaussian field statistics for both 
pseudo C£ and PolSpice are affected by the change in the underlying distribution only very 
mildly at small scales. As expected, the jackknife results are independent of the distribution 
and are equivalent to the ones shown in section B.l. 

We now turn our attention to the correlation matrix Corr(C£, O^') = ■ Fig¬ 

ure 18 shows the correlations between individual i values at large scales around £ = 25 and 
small scales around £ = 650. As expected, the large scale part of the correlation matrix is not 
affected by the non-Gaussianity as the perturbations on large scales are still approximately 
Gaussian. Looking at the correlations between individual £ values on small scales, however, 
we find small amounts of extra correlations introduced by the log-normal field for scales sep¬ 
arated hy 5£ > 5. Due to the assumption of an isotropic Gaussian field, the extra correlations 
are not estimated correctly by PolSpice. The jackknife estimator, however, is able to pick 
these correlations up for 5£ > 15 while it is biased to negative correlations for 5 < 5£ < 15. 
The variance of the jackknife estimator (shown by the shaded area) is too large to resolve 
the extra correlations when estimated from a single realization of the field. Figure 19 hence 
shows the correlation Gorr(C'£, C^+^f) as a function of 5£ when averaged over 100 neighboring 
£ values. Again, there are additional correlations at small scales (600 < £ < 700, right panel), 
but this time the noise in the jackknife estimate is just about small enough to detect those 
correlations (for 5£ > 15) even from a single realization. 
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